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ABSTRACT 

Context. The implementation of fringe tracking for optical interferometers is inevitable when optimal exploitation of the instrumental 
capacities is desired. Fringe tracking allows continuous fringe observation, considerably increasing the sensitivity of the interfero- 
ns) ■ metric system. In addition to the correction of atmospheric path-length differences, a decent control algorithm should correct for 
disturbances introduced by instrumental vibrations, and deal with other errors propagating in the optical trains. 
Aims. In an effort to improve upon existing fringe-tracking control, especially with respect to vibrations, we attempt to construct con- 
trol schemes based on Kalman filters. Kalman filtering is an optimal data processing algorithm for tracking and correcting a system 
on which observations are performed. As a direct application, control schemes are designed for GRAVITY, a future four-telescope 
near-infrared beam combiner for the Very Large Telescope Interferometer (VLTI). 
i-^ , Methods. We base our study on recent work in adaptive-optics control. The technique is to describe perturbations of fringe phases in 
ft. terms of an a priori model. The model allows us to optimize the tracking of fringes, in that it is adapted to the prevailing perturbations. 
Since the model is of a parametric nature, a parameter identification needs to be included. Different possibilities exist to generalize to 
«"* j ' the four-telescope fringe tracking that is useful for GRAVITY. 

Results. On the basis of a two-telescope Kalman-filtering control algorithm, a set of two properly working control algorithms for 
four-telescope fringe tracking is constructed. The control schemes are designed to take into account flux problems and low-signal 
baselines. First simulations of the fringe-tracking process indicate that the defined schemes meet the requirements for GRAVITY 
and allow us to distinguish in performance. In a future paper, we will compare the performances of classical fringe tracking to our 
Kalman-filter control. 

^- . Conclusions. Kalman-filter based control schemes will likely become the next standard for fringe tracking, providing the possibil- 

ity to maximize the tracking performance. The results of the present study are currently being incorporated into the final design of 
CH ■ GRAVITY. 
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*vi . 1 . Introduction value, a command is calculated by the OPD controller, which is 

__i . ,, . , . . then sent to an OPD-correcting actuator system. The result is that 

Ever since the early days of optica interferometry, the motion fringes ^ ^ ^^ beam combiner ^ stabilized t0 atmo . 



X 



of fringes due to atmospheric turbulence has been a major km- heric motion> and can be inte ted much j than the t 

itation to this observation technique. One possible way to avoid ^ atmospheric cohere nce time. In addition, the synchronous 



the blurring of fringes on the detector is to use short fringe ex- ^^ of the frin on a bri ht source allows the interfero . 

£ , posures, since short integration times temporally freeze atmo- metric observation of much fainter sources 
spheric motion. In combination with the low optical throughput 

of interferometric systems, however, the sensitivity associated A P art fr ° m turbulence, an important limitation of current 
with short integrations is low, even on the largest-telescope sys- real-time wavefront correction or tracking is instrumental vibra- 
tems tions. Initial on-sky tests of NAOS, the first adaptive-optics (AO) 
Tests with prototype i nterferometers (IShao & StaelinlU980l s y stem on the Very Large Telescope (VLT), attri buted a loss in 
for a further overview, see lMonnierll20r]l demonstrated the u se Streh l ratio of up to 1 5 % to mechanical vibrations (|Rousset et al.| 
of the control technique/ringe tracking to increase the sensitivity I2QQ3|). In the case of the VLT Interferometer (VLTI), longitu- 
of interferometric observations. Correcting for the atmospheri- dinal vibrations on the level of the baselines cause fringes to 
cally introduced optical path difference (OPD) between different mov e onto the detector. Power spectra of sequences of OPD 
interferometer arms indeed allows the continuous observation of values estimated during a run on the VLTI/PRIMA instru- 
the fringes. In a fringe tracker, a phase sensor estimates the OPD ment 0elplancke| |2008|) indicate the presence of a discrete 
corresponding to the current fringe exposure. On the basis of this number of vibration components during fringe observations 
dSahlmann et alj|2009h . In an effort to reduce the vibration level 
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differ e nt VLT subsystems has been initiated (lHaguenauer et al.l 
l2008t iPoupar et al.1 l2.Q10h . The efforts to isolate noisy instru- 
ments has led to significant improvements. 

Active correction techniques can also be applied to reduce 
vibration contribution. On the one hand, a vibration-sensing ac- 
celerometer system operates on the first three mirrors of the four 
8.2-m unit telescopes ( UTs) and effectively corr ects vibrations 
in the range 10-30Hz (lHaguenauer et al. 1 120081) . On the other 
hand, the control algorithm of the OPD controller can be ex- 
tended with additional vibration blocks that correct for a defined 
set of vibration components . The Vibration TracKing algorithm 
(VTK; iDi Lieto et alj 120081) is such an example that operates 
as a narrow-band filter. For the Keck Interfero meter, a similar 
narro w-band control system has been designed (IColavita et alj 
l2(5Toh . 

In recent years, there has been an effort in AO to design new 
control schemes. Of s pecial importan ce are the techniques based 
on Kalman filtering dKaimanlll960l) . which is a technique de- 
signed for general data processing. It is used to track and correct 
observed systems in a statistically optimal way, in that it min- 
imizes the norm of the r esidual signal (for line ar systems with 
Gaussian noise statistics). iLe Roux et alJd2004l) designed an AO 
control scheme based on Kalman filtering. One of the imp ortant 
chara cteristics of this scheme is that it can be extended f.Petitj 
120061) to coherently treat turbulence and vibrations, rather than 
add extra vibration control blocks to an unsuitable classical con- 
troller. As usual for Kalman filters, it is based on an a priori 
model for the controlled system, which here is referred to as tur- 
bulence and vibration perturbations. 

The purpose of this work is the design of Kaiman-filtering 
control schemes for fringe tracking, starting from the work 
that has been done for AO control. In particular, we consider 
the case of four-telescope interferometry, and develop control 
schemes that w ill be applicable to the fringe-t racking subsystem 
of GRAVITY ( lEisenhaueretai]|20TTl 12001 . which is a four- 
telescope beam combiner to be installed on the VLTI. The main 
purpose of this second-generation instrument is high-precision 
narrow-angle astrometry and phase-referenced interferometric 
imaging in the near-infrared K band. The main scientific target 
of GRAVITY is Sgr A*, which is currently the most well-studied 
and likely supermassive black hole, located at the Galactic cen- 
ter. On-sky operation is expected in 2014. 

Conceptually, the idea of using Kalma n filtering to correc t 
for atmospheric fringe motion dates back to Reasenb ergl(ll990h . 
but has never been implemented under that form. A simple 
Kalman-filter based strategy for single-baseline fringe tracking 
was effectively used at the Palomar Testbed Interferometer (PTI, 
IColavita et alj|1999l) . which to our knowledge, is the only ex- 
ample of on-sky fringe tracking using Kalman-filter techniques. 
lLozi et all j2010h demonstrated the operation of Kalman filter- 
ing for vibration correction on a laboratory prototype for single- 
baseline space-based nulling interferometry. 

The use of Kalman filtering for (ground-based) fringe track- 
ing will have several advantages compared to classical control 
strategies (e.g. integrator control): 



- Kalman filtering allows us to control turbulence and longitu- 
dinal vibrations in a coherent and equivalent way: the contri- 
bution of different perturbation components is decomposed. 

- All information on disturbances is used via an a priori 
model. This makes Kalman filtering a statistically opti- 
mal control method when considering linear systems with 
Gaussian white noise statistics. 



- The use of Kalman filters may solve issues with ex- 
isting active vibration control at VLTI (e.g. artifacts in 
frequency spectra of residual VTK-corrected data; see 
lHaguenauer et al.ll2008l) . 

- The predictive nature of the filter allows us to deal with the 
loss of observables due to measurement noise. 

In the case of the latter point, fringe-tracking control schemes 
need to be robust against flux losses owing to imperfect beam 
combination. Failures in tip-tilt correction, for instance, can lead 
to the short-time absence of fringe signal. The prediction capa- 
bility of the Kalman filter controller is leveraged for this purpose. 
The outline of this work is the following. In Sect. [2] we 
present the AO Kalman-filter control scheme, translated into 
a form applied to two-telescope fringe tracking. The Kalman 
filter is based on a parametric model and we present a tech- 
nique for parameter identification in Sect. [3] Since an aim of 
this work is to develop fringe-tracker control schemes applicable 
to GRAVITY, we extend the control in Sect. |2]to four-telescope 
systems (Sect.|4]i. After considering some specific improvements 
in Sect.[3J we introduce the complete control schemes in Sect. [6] 
Before concluding this work, the results of the first simulations 
of four-telescope Kalman-filter fringe tracking for GRAVITY 
are presented (Sect. [7). 



2. Basic case: two-telescope Kalman-filter control 

The input information provided for fringe-tracking control are 
the OPD values estimated by the phase sensor. The aim of the 
OPD controller is to give an appropriate output command to the 
piston actuators (piezo actuators and delay lines). This section 
introduces the control law that is appropriate for a two-telescope 
interferometer. The approach used is based on a formalism de- 
veloped for AO modeling, hence we can directly apply it here to 
fringe trac king. For a more comp lete introduction to this model, 
we refer to lLe Roux et al.l(l20o4 and lPetitl (120061) . which are the 
papers on which the current section is bas ed. The notation we 
use is (mainly) that of lMeimon et alj j2010h . 



2.1. Modeling the fringe-tracking control 

We have already mentioned before that Kalman filters need a 
model for the controlled system. We successively introduce the 
model for the fringe tracking and the evolution of the distur- 
bances (turbulence and vibrations). 



2.1 .1 . Description of the fringe-tracking process 

Turbulence and longitudinal vibrations add up to introduce OPD 
perturbations. The OPD in a two-telescope interferometer can be 
related to a phase value ip as 



<p = tp** + Y J <p vihi = 



In . 

T 



(1) 



In reality, the quantity y observed by a fringe tracker is not ex- 
actly ip, but contains two additional components. The first of 
these is an additive measurement noise w. Secondly, a fringe 
tracker in operation will apply correction phases, or commands, 
u using the piston actuators. We can therefore express the mea- 
sured value of the fringe tracker as 



y = ip — u + w. 



(2) 
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In what follows, we refer to y as the residual OPD. For simplic- 
ity, we ignore conversion factors, e.g. 2n/A, and assume that all 
quantities are expressed in the same units (say, microns)Q 

To get into the problem of discretization, we denote by T the 
elementary integration time of the fringe tracker. The discretized 
variables are then defined in the following way: 

- <p' n is defined as the average phase of the disturbance compo- 
nent i during the time interval [(« - \)T, nT]: 



1 J(n-l)T 



dt<p'(t). 



(3) 



The sum of all these phases is referred to as ip n . 

- The command is a fixed value during one integration (as- 
suming that the response time of the actuators is much 
shorter than T). We therefore define u„ as the command 
applied at time step n, i.e. applied during the time interval 
[nT,(n+l)T]. 

- The final parameter, the noisy component w„ at time step n, 
could in principle be defined in a similar way as ip l n . In this 
model, however, w will be considered as simple zero-mean 
white Gaussian noise that adds to y. The standard deviation 
is fixed to <r w , which makes the values w„ independent of 
other variables. 

The assumption that the data processing after integration 
amounts to one time step T leads to the following discretized 
version of Eq. (0: 



y n - <Pn-\ ~ W„-2 + W„. 



(4) 



The quantity y„ is thus the measurement that is available at time 
step n. We include a schematic representation of the fringe mea- 
surement model in Fig.Q] 

A final note to this model for the fringe-tracking process con- 
cerns the notion of a fixed measurement error o~ w . Further on in 
this work (Sect. |5}, we introduce an adaptive weighting accord- 
ing to the (variable) signal-to-noise ratio (S/N) conditions. The 
current assumption of a fixed cr K is only for clarity reasons and 
not for the actual implementation. 

2.1 .2. Description of the OPD evolution 

The formalism of Kalman filtering that is used to estimate the 
commands, requires an a priori model not only for the mea- 
surement process, but also for the evolution of the observed sys- 
tem, that is, of the turbulence and the longitudinal vibrations. In 
essence, this requires the description of new states of the system 
in terms of previous realizations. 

The OPD spectra obtained at the VLTI typically contain a 
number of peaks (e.g lSahlmann et aTll2009l) . which are associ- 
ated with vibration modes excited by a variety of processes. In 
the following, we can assume each mode to be independent. The 
simplest way is then to describe the dynamics of such a vibration 
mode i is to consider it as a discrete damped o scillator, ex cited 
by a Gaussian white noise v. It can be shown dPetiti r2006) that 
the one-dimensional equation of motion for a vibration mode i 
of natural frequency fl and damping coefficient k' can be dis- 
cretized into the recursive equation 






_vib j" vibi 



vib/' vib i 



<Pn + a 2 <Pn-\ + V - 



1 More generally, one can write y 
factors D and N. 



(5) 
D(<p- Nu) + w, with conversion 



(n - 2)T 
I 



(n - 1)T 
I 



nT 



nT 



(n + DT 



integration 

measurement y n command u n 
Fig. 1. Time diagram of the fringe tracking process. 



where 



2e -2nk<f T cos (27r/ ( ;rVi-fc'' 2 ), 



a^ bi = -e- AKk ' f '° T . 



(6) 

(7) 



Eq. (0 has a form that is better known as an autoregressive 
model of order two, AR(2). We note that the conversion from 
continuous to discrete involves a Taylor approximation. 

The modeling of longitudinal vibrations as randomly-excited 
damped oscillators leads in a natural way to an AR(2) descrip- 
tion. One may ask whether the contr ibution of turbulence can 
also be described as an AR(p) process. ILe Roux et al.l(l2004l) use 
an AR(1) turbulence model, based on temporal evolution charac- 
teristics of the phenomenon and basic parameters such as wind 
speed and the telescope diameter. A s lightly more a dvanced first- 
order temporal model is discussed in lLoozel(l2007l) . 

When taking a damping coefficient k > 1, the previ- 
ous AR(2) vibration model can also be used to describe non- 
res onating signals , as in the case of turbulenceQ This property 
led iMeimon et al.l (120101) to use an AR(2) model for the turbu- 
lence description. We follow their argumentation and write 



~tur 
fn+1 



i -iur , _iur „ur , ,,l 
Vn + a 2 <Pn-\ + V n 



(8) 



2.2. State-space description of the system control 

The state-space representation is a general framework to de- 
scribe problems involving an evolving system on whi ch mea- 
surem ents are performed (for a general introduction, see iDurbanl 
120041) . Under the assumption of linearity and time invariance 
(LTI state-space models), the model has the form of two recur- 
sive (constant) matrix equations: one describes the evolving sys- 
tem and the other the observational process on that system. 

At least qualitatively, we have good indications that the con- 
sidered problem can be described as an LTI problem. On the 
one hand, the states of turbulence and vibrations can be con- 
sidered as realizations of a time-invariant system (for reason- 
able timescales). On the other hand, the OPD measurements are 
clearly a form of observation of that system. Using standard 
state- space notation, we c an cast Eqs. (@J, (|5), and © into the 
form dMeimonetal .11201 Oh 



x„ + i = Ax„ + v„ 
y„ — Cx„ — m„_ 



[equation of state], (9) 

[observation equation] . (10) 



where (for the sake of the example, we assume two vibration 
components, vib 1 and vib 2) 



_T / tur tur vib 1 vib 1 vib 2 vib 2 \ 
x « = ifn fn-l Vn <P n -l Vn <P n -\ )' 


(ID 


v T = ( v m v vibl v vib2 0), 


(12) 


C = (0 1 1 1), 


(13) 






2 The complex identities Vl - k 2 = j V& 2 - 1 and cos(j0) = 


= cosh# 



are used to compute a\ in Eq. 0. 
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(14) 



The vector x is called the state vector of the system. In addition, 
we denote by S v and Z„, = cr 2 , the covariance matrices of the 
system noise (v) and measurement noise (w), respectively. We 
note that for the current example, I v is of the form 



Z v = E{vv T ) = diag(t7^ r ,0,< bl ,0,tr™ 2 ,0r. 



(15) 



As usual in state-space representation, the state variable x„ is 
a hidden quantity: only y„ is observed. It is important to high- 
light the block-diagonal form of the matrix A: every perturba- 
tion component (turbulence and vibrations) contributes a single 
AR(2) block. 

2.3. Control: the (asymptotic) Kalman filter 

Kalman filtering dKalmanl 19601) is a technique designed for gen- 
eral data processing. It is used to track and correct observed sys- 
tems in a s tatistically optimal w ay, in that it minimizes the resid- 
ual signal ( IChui & Chenl2009l) . Kalman filters start from a linear 
state-space description of the system and provide estimations of 
the system's (hidden) state vector derived from the observations. 
We denote by x„|„< the estimation of the state vector at 
time step n, based on all observations up to time step n' (i.e., 
yo, . . . ,y„>)- Given the state-space model in Eqs. (|9]i and ( fTOb , the 
Kalman filter equations^] take the form 



x n|n — x n|«-l + Goo^n ~~ Cx„|„_i + M n _2), 
x «+l|n — AX„| n . 



(16) 
(17) 



The former of these equations updates the information about the 
disturbance state x„, taking into account the measurement y„ that 
has become available at time step n. The latter then predicts the 
state at the subsequent moment in time. 

The matrix G M in Eq. ( fTSI l is the (asymptotic) Kalman gain 
matrix, calculated as 

Goo = ZcoC T (CZooC T + Z W T\ (18) 

where Z M is the solution of the matrix (Riccati) equation 

loo = AZ M A T - AZooC T (CZooC T + Z U ,) 1 CZ K) A T + Z v . (19) 

The gain matrix determines the weight by which the new mea- 
surement y„ is taken into account for updating our estimate of 
the state vector x„. The mathematical form of Goo can be derived 
by r equiring that the exp ectation value E{(x„ - x„|„) 2 } is minimal 
(e.g. lChui & Chenll2009l) . We note that the gain matrix contains 
the noise characteristics (I,,, Z„), making it optimally adapted to 
the considered system. 

The final piece of information that is needed to design a two- 
telescope control scheme is an equation to determine commands. 
An optimal command will minimize the observed residual OPD 
y„, which was given by Eq. ( fTOb - The command u needs to com- 
pensate the x-part (w is the noisy part, which cannot be reduced 



3 Note that Eqs. dl6t and dl7t represent the asymptotic Kalman equa- 
tions. The non-asymptotic equations can be found e.g. in lChui & Chenl 

(2009). 
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Fig. 2. A general perturbation power spectrum consists of a tur- 
bulence spectrum, a flat (measurement) noise spectrum and vi- 
bration peaks. 



by u). It is not difficult to see that the most appropriate choice of 
u is 

u„ = Kx„ + i i; „ (20) 

where (still for the two-vibration example) 

K = (l 1 1 0). (21) 

This result forms the final equation needed for control. 

3. Parameter identification 

The state-space description that we need to model the fringe- 
tracking process contains a possibly large number of parameters. 
Reconsidering the state-space model in Eqs. © and ( fTOt . we see 
that the parameters to identify are 



{(a\,a' 2 , <t\) I Vi e {disturbances}} and cr H 



(22) 



Once the parameters are obtained, we can "fill" the system ma- 
trices, and apply the Kalman filter. 

We assume that OPD measurements from the phase sen- 
sor are the only available piece of information about the distur- 
bances. In other words, it is assumed that no other a priori infor- 
mation about the disturbance components is available (e.g. from 
other observation techniques). 

3.1. The spectral method 

The method we use to calculate model pa rameters is based on 
the one proposed by iMeimon et al.l (I2010J hereafter the MPFK 
method). The MPFK method is a power-spectrum-based routine 
designed for the parame ter identification o f the Kalman-filter AO 
control scheme of Le Roux etail d2004 and iPetiJ d2006), the 
scheme that lies at the basis of Sect. [2] We refer to the original 
MPFK paper for details of the method, and give here a brief 
summary and some information on our modifications. 

The operation of the MPFK method is based on modeling the 
(estimated) power spectrum of an observed perturbation-phase 
sequence. In a time interval before identification, a sequence 
of closed-loop data is obtained and corrected for the applied 
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commands (pseudo-open loop data, POL). In the notation of the 
state-space model, a sequence of data 



{y p „ OL = y n + Un 



0. 



,N-1) 



(23) 



is collected. The periodogram of this sequence contains the im- 
print of the different perturbation components that contribute to 
the phase signal (Fig. |2). In the first fitting step, the measure- 
ment noise is estimated from the flat tail of the periodogram. 
By stepwise adding AR(2)-model spectra to the estimated noise 
spectrum, first for the turbulence and then for the vibration com- 
ponents, the global power spectrum is assembled until a stop- 
ping criterion is satisfied. The iterative fitting itself is based on 
the maximization of the likelihood function associated with each 
model periodogram. We note that the MPFK method is designed 
to operate in non-real time, i.e. outside the real-time tracking 
loop. 

In the standard MPFK method, a vibration-peak fitting step 
makes use of a large parameter grid for the three parameters 
associated with a single vibration component {a' v a l 2 ,cr[). The 
method uses no a priori information about the vibration com- 
ponents, but simply maximizes in each iteration the likelihood 
criterion corresponding to the given grid. In other words, the 
fitting of vibration peaks is done in a "blind" way on a large 
grid. To limit the computation time, we decided to add a sim- 
ple subroutine that selects peaks from the power spectrum. In 
every iteration, we flag the points that differ significantly from 
the estimated power spectrum resulting from the previous itera- 
tion. Picking each time the highest peak allows us then to use a 
much smaller grid around a first guess for the peak parameters, 
which is made by estimating the maximum, width, and central 
frequency of the peak. In addition to the gain in computation 
time, peak picking limits the risk that multiple peaks are being 
fit with a single large peak, degrading the quality of the fit. We 
note that our peak-picking method selects peaks based on their 
height, rather than their energy. 

3.2. Practical application 

In the MPFK paper, it is shown that N = 2000 is a proper choice 
for the length of the POL-data acquisition. Since POL data is 
constructed from closed-loop data, the acquisition of data se- 
quences does not require the loop to be (re)opened. This prop- 
erty is advantageous for the operation of the fringe tracker and 
opens perspectives to perform updates of the model parameters 
in closed loop. There is however a fundamental difference be- 
tween the case of AO, for which the MPFK method was de- 
signed originally, and interferometry. In the former case, the ini- 
tialization of the Kalman filter can be based on a sequence of 
open-loop data (no POL data is available at initialization). In 
the case of fringe tracking, however, it is impossible to obtain a 
sequence of open-loop data without taking the risk that fringes 
are lost. Unlike the case of AO, where wavefront observables re- 
main well-defined at the level of the wave-front sensor (e.g. mi- 
crolens images for a Shack-Hartmann wavefront sensor), uncor- 
rected fringes have a typical root mean square (rms) motion that 
largely surpasses typical detection ranges. 

The initialization of the Kalman-filter controller for fringe 
tracking thus inevitably requires a different strategy than open- 
loop measurements. A straightforward solution is to close the 
loop temporally using a classical control algorithm. POL data 
obtained in this way can then be fitted, and consequently we 
switch from the classical to the Kalman-filter controller. We note 
that this technique assumes that we take into account the differ- 



ence in measurement error: the measurement noise cr w for a clas- 
sical controller, extracted from the POL-data fit, will be higher 
than for the Kalman filter. The measurement error for the lat- 
ter can be estimated based on Eq. ( l59l l. which is introduced in 
Sect. E] 

3.3. Other methods 

The spectral identification method is based on a maximum- 
likelihood method to model periodograms of measurement se- 
quences. Alternative metho ds exist to identify parameters in 
state-space models (see also lMeimon et al.ll2010l) . 

The advantage of the spectral method is that it is rather 
transparent, and gives satisfying results (see Sect. |7). The rea- 
son is that the imprint of the different perturbation components 
can be largely decoupled in frequency space, which makes it 
possible to separate the different disturbances. Future tests on 
real data will have to justify the use of the method for fringe- 
tracking purpose, and allow us to optimize the parameter-grid 
construction. A disadvantage of the method, however, is that it 
is difficult in general to make an iterative fitting routine robust 
against outliers and non-standard situations. A good alternative 
to the spectral-fit method could be a method of a purely alge- 
braic n ature, e.g. similar to the one proposed by iHinnen et al.l 
(120051) . We are currently checking the possibility of using meth- 
ods of this nature for parameter identification, as an alterna- 
tive to the spectral method introduced here. Tests of identifica- 
tion metho ds based on the reconstruct i on of the measured time 
sequences (IShumwav & Stofferi Il982t IZiskand & Hertzl 1 19931) 
were not successful in the current case, basically owing to the 
large number of components that need to be characterized. 



4. Four-telescope fringe-tracking control 

In the two previous sections, we have presented a general and 
complete control scheme for two-telescope fringe tracking. It 
is generally known that two-telescope interferometers provide a 
rather limited amount of information per observation (one point 
in the i/v-plane per wavelength bin, no direct phase information). 
Optical interferometry needs to be extended to include more 
apertures, especially when considering image reconstruction 

The purpose of the current section is to find how two- 
telescope control can be extended to more telescopes. We chose 
to consider the case of four-telescope fringe tracking, in the 
framework of the second-generation near-infrared beam com- 
biner GRAVITY for VLTI. GRAVITY will combine the beams 
of the four UTs (or four 1.8-m Auxiliary Telescopes, ATs) in a 
completely redundant way, i.e. on six baselines. Per elementary 
fringe exposure, the instrument therefore measures six OPDs, 
rather than one OPD for a two-telescope beam combiner. The 
OPDs are corrected by a system of four actuators, i.e. one per 
telescope. We now consider how the six observables can be 
transformed into proper commands for the actuator system. We 
note that this problem is absent in the case of a two-telescope 
beam combiner: one OPD measurement corresponds to one 
command. Our four-telescope approach can easily be general- 
ized to 71-telescope fringe tracking. 



4.1. General modal-based control 

Piston disturbances acting on a four-telescope interferometer can 
be considered as vectors in a four-dimensional space. In this 
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Fig. 3. Graphical representation of the four pistons (top) and 
the four modes (bottom). The four telescopes are represented 
as four pixels per square. Grey zones denote the zero-positions, 
black zones denote positive displacement, and white zones equal 
negative displacement. The modes could for example be labeled 
as tilt, twist, tip, and global piston. They form an equivalent to 
Zernike modes in AO, but for a four-telescope interferometer. 



space, an arbitrary disturbance P can be expressed as 
P = (P° P l P 2 P 3 ) T . 



(24) 



The actual piston disturbances P are not directly observable: 
only piston differences or optical path differences (OPDs) are 
measured. In a similar way to above, we can define OPD vectors 
as 

OPD = (OPD 01 OPD 02 OPD 03 OPD 12 OPD 13 OPD 23 ) T , (25) 

where OPD' J = P' - P' . For simplicity, we have assumed that 
pistons and OPDs are expressed in the same units, say microns. 
The conversion P — > OPD is then given by the matrix equation 



OPD = M P, 



(26) 



where 



M T 



(-i -i-io o (n 

i o 0-1-10 
10 10-1 
1 1 I ) 



It is easy to show that the system in Eq. ( 1261 ) is not invertible. 
Mathematically, the matrix M is of rank three, rather than four. 
Physically, this corresponds to it being impossible to recover the 
global piston P tot = P° + P l +P 2 +P 3 from OPD observations (in 
a similar way to the piston not being observable with a single- 
dish telescope). A four-telescope fringe tracker therefore only 
needs to correct for three net observables. 

Rather than using a description that is based on pistons, we 
perform a matrix transformation, based on a singular value de- 
composition of the matrix M, to construct a set of variables that 
directly isolates the invisible global piston. We define the mode 
four-vector Q (£>' for i = 0,1,2, 3) as 



Q = V T P, 

where 
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(28) 



(29) 



This orthogonal transformation isolates (a multiple of) P into 
(2 3 . The Q-modes form an equivalent to the Zernike modes in 
AO, and their relation to the pistons is graphically represented 
in Fig. [3] The conversion between vectors OPD and Q is then 
given by 



(30) 



Q = H OPD, 




where 
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(31) 



We note that the matrix H automatically puts g 3 to 0, indicat- 
ing again that we have no information about P tot from vectors 
OPD. The three components Q°, Q l , and Q 2 now form a proper 
description of the observable system with three degrees of free- 
dom. A general control scheme based on modes consists then of 
three steps: 

1 . We calculate the residual mode vector yg from the measured 
residual OPD 6-vector y using the transformation in Eq. (l30l 



ye = H y- 



(32) 



Using three controllers (e.g. integrator controllers), we cal- 
culate a proper Q-command Ug to correct the Q-modes. We 
fix m 3 = 0, that is, the global piston is fixed (by convention). 
Transform Ug into the four-component piston command u 
using the inverse transformation of Eq. (128V u = Vug. 
These four commands are sent to the actuators for piston cor- 
rection. 



4.2. Modal Kalman-filter control 

Above, we introduced a properly defined control scheme based 
on a modal approach. We now apply a similar philosophy to a 
Kalman-filter based control model, and find the first appropriate 
control law for four-telescope fringe tracking. 



(27) 4 2.1. Modal state-space model 



Starting from the state-space description in Sect. I2.3I we de- 
rived (see Appendix [A) a full modal state-space model for a 
four-telescope fringe-tracking system, which can be easily gen- 
eralized to n telescopes. Individual OPD measurements are as- 
sumed to be uncoupled in this model. Additionally, all piston 
perturbations are considered as being independent. We note that 
this condition is imposed on two scales. First, on the scale of 
one telescope, we ignore the coupling between different vibra- 
tion components. Secondly, on the scale of all telescopes, we 
assume neither that coupling exists between different telescope 
vibrations, nor that the turbulence perturbation on the pistons is 
couplecQ 

The result of this analysis is the modal state-space model 
(compare with Eqs. (0 and ( fTUt ) given by 



x g,h+i - Axg„ + vg iH , 

yg,n = Cx G ,„ - Ug,„_2 + Wg,,,. 



(33) 
(34) 



4 On scales comparable to or larger than the turbulence outer scale, 
however, it is true that this assumption will break down. 
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The vector quantities have the following definitions: 

Xq: 4-block modal state vector 

yg: modal measurement 4-vector (y 3 „ = 0) 

Uq: modal command 4-vector (u 3 = 0) 

\q: 4-block modal system noise vector, of covariance 

matrix Zg iV 
Wq: measurement noise 4-vector, of covariance Zg (W ; 

note that w 3 Q = 

On the other hand, the matrices A and C are 



A = 



(A 0^ 
A 
A 
A) 



and 



C = 



(C (Ti 

C 

C 

0J 



(35) 



where the blocks A and C have the same structure as the cor- 
responding system matrices in the two-telescope state-space 
model. All scalar quantities (y„, u„, w„) have now become four- 
vectors, and the vector/matrix quantities (x„, A, etc.) have ob- 
tained a four-block structure (xg,„, A, etc.). This is all because 
there are four modes. There are several interesting properties 
about the above result, which we discuss now. We note that we 
first neglect the noise terms. 

First, all four terms in Eq. (l34l have a last component that is 
immediately defined to be (see also matrix C). In other words, 
the global piston is automatically defined as not being observ- 
able. Thus, although it is still a real evolving quantity described 
by the last block component of the vectors in Eq. ( l33l l. we can 
neglect it for the analysis. 

Secondly, when looking at the structure of the state-space 
model and of Eq. d35l ). we see that all modes have a decoupled 
evolution (still neglecting the noise). What is more is that the 
sub-matrices of A are the same, and the same observation holds 
for C. Neglecting thus the noise, we see that the evolution is de- 
scribed by four identical copies of a two-telescope state-space 
model (with the modification that for the last copy the obser- 
vation equation is 0). Modes therefore essentially behave in a 
similar way to the quantity tp of the previous sections^ 

Considering the noise terms. A few things change when we 
also take into account the noise terms. Since modes are calcu- 
lated as combinations of pistons, Eq. (l28l i. or combinations of 
OPDs, Eq. d32l i. the covariance matrices are in general no longer 
diagonal (given our assumption that piston perturbations as well 
as OPD measurements are independent). 

Since we have individual OPD measurements at our dis- 
posal, it is possible to estimate the measurement errors <x'^ as- 
sociated with the measurements of OPD' J . This allows us to cal- 
culate the mode measurement error covariance matrix Zg )W as 



Zg, w - H Z w H , 
where 



I w = diag(a» 0*0*0^0^0?) 



23\2 



(36) 



(37) 



(we refer to Appendix fAb. The evaluation of the covariance ma- 
trix Zg v , however, is a more fundamental problem. The matrix 



5 Applying the strategy described in Appendix lAlto the two-telescope 
case shows that there are two modes: P l - P° and P° + P . The former 
of these is exactly (a multiple of) tp\ the state-space model in Sect.|2]is 
already in a modal form. 



Zj3 >v depends on quantities that require direct piston information, 
which is inaccessible (owing to the unobservability of the abso- 
lute pistons). We neglect the off-diagonal entries; under this ap- 
proximation, it can be shown that Zg jV has a diagonal built of four 
equal diagonal parts (e.g. like the structure of A). In this model, 
the evolution of the four modes by Eq. (l33l is thus genuinely 
independent and equal. 

Convention. Up to now, we have carried the global piston mode 
as a superfluous burden in our matrix equations. Since we as- 
sume that the modes are fully decoupled, we can drop all matrix 
and vector blocks corresponding to Q 3 . Under this convention, 
the fundamental matrices V and H become 



V= - 

2 



(-1 -1 -Y\ 

-1 1 1 

1 -1 1 

1 1 -1) 



and 



H = - 
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-1 


-lj 



(38) 



It is not difficult to generalize this operation to the other quanti- 
ties. For example, the measurement-error covariance matrix Zg jW 
loses its last column and line and becomes a 3 x 3 matrixQ 

4.2.2. Modal Kalman filter 

The non-diagonal nature of the covariance matrix Zg „> shows 
that the modes are not independently measured. It follows that 
the control scheme has to involve one global Kalman filter for 
the three modes, rather than one per mode. The full Kalman- 
filter equations then take the form 



Xg,„|« - Xg,„|„_i + Goo (yg,n _ Cxg jK |„_i + Ug,„_2), 



Xg,n+l|n - Axg,„|„. 



(39) 
(40) 



In this equation, the Kalman gain Goo is calculated as in Eqs. ( fTSl 
and ( fT9l . with the changes A, C, Z v , Z w — > A, C, Zg (V , Zg w , 
Finally, the optimal mode command is calculated as 



Ug,« = Kxg 



n+l|n> 



(41) 



where the matrix K is defined in the same way as C in Eq. (l35l . 
with blocks K defined as in Sect. 12.31 



4.2.3. Practical considerations 

The modal Kalman-filter control is an interesting control 
scheme: it is completely symmetric, its conversion matrices are 
simple, its input and output have the same dimensions (three), 
and (under the above approximations) each mode has exactly the 
same state-space model. An additional advantage of the modal 
scheme is presented in Sect. 14.41 where we consider the identi- 
fication of the parameters in the state-space model described in 
Eqs. d33j and (E3). 

The symmetry involved in the definition of the modes im- 
plies that the control will work most effectively when all other 
involved quantities are symmetric. It is easy to see, however, 
that this is not the case. The design and implementation of the 
GRAVITY system are largely symmetric, whereas actual obser- 
vation conditions (e.g. wind shake, vibrations) and baseline con- 
figurations are not. Whenever a single component in the system 
fails, for example one telescope or one baseline, at least two of 



6 Using Eq. J36l >. it can be shown that the last column and row were 
zeros anyway, exactly owing to their unobservability. 
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the three observables are spoiled in the current control scheme 
(each OPD contributes to two modes). A priori, the loss of ob- 
servables might be expected to have a negative impact on the 
control. We therefore propose a second four-telescope control 
scheme, which is based on the tracking of the individual OPDs. 

4.3. OPD Kalman-filter control 

An alternative to a global Kalman filter that tracks all modes is 
to use six Kalman filters to control each individual OPD. In this 
(redundant) scheme, each telescope pair is considered as a single 
two-telescope interferometer. 

One might wonder why it would be more interesting to track 
twice the number of variables (three modes versus six OPDs). 
The main interest of the modal control is that it was defined in 
a perfectly symmetric way to transform the six observables and 
four pistons into three quantities that are tracked. Here, the con- 
version to and from modes is done using fixed matrices. In the 
OPD control, however, the redundancy of the conversion to and 
from OPDs allows us to choose different ways of defining com- 
mands. In particular, one can consider non-symmetric combina- 
tions of observables or — even better — adaptively weight observ- 
ables with respect to the physical conditions of observations. The 
latter possibility is our main point of interest: whenever observ- 
ing conditions (low flux, low-signal baselines, etc.) make certain 
OPDs unfavorable, we can lower their weight in the command 
calculation. Additionally, it even allows us to let the interferom- 
eter work with only three or two telescopes. 

We postpone a deeper discussion of the above problem to 
Sect. |5] where we also modify the modal scheme to a more ro- 
bust version. For completeness, we already give an appropriate 
method for command-vector calculation. By xopd, we denote the 
six-block OPD state vector, i.e. the state vector corresponding to 
each OPD organized into a column vector. The four-component 
piston command u„ is then calculated as 



u « - M^Kxqpd.h+IIh, 



where K is the six-block version of the previously introduced 
matrix K, in order to operate on the six substate vectors in xopd- 
The matrix MJ^, on the other hand, is a weighted generalized 
inverse to the matrix M (Eq. d27b), calculated as 



M* = (M T W M) 1 " M T W. 



(43) 



In this equation, the 6x6 weighting matrix W, defined further 
on in Eq. ( |47| ). distributes the weight among the different OPD 
substates for the command calculation. 



4.4. Four-telescope parameter identification 

The MPFK method (including the peak-pick extension) can be 
directly applied to two-telescope fringe tracking, for the sim- 
ple reason that it is exactly designed for the corresponding 
state-space model. In the four-telescope case, the sequences of 
y^ 0L (Eq. ( l23l ) are replaced by equivalent sequences of six- 
dimensional vectors y,^ 0L , i.e. the POL OPD measurement six- 
vectors. 

In the modal state-space model, the three observed modes 
are decomposed into the elementary disturbances. Therefore, the 
most logical option is to apply the spectral fit to the time se- 
quences 



where y?,° L are the three-component POL mode measurements 
(we use Eq. (l32li). Under the convention of neglecting the off- 
diagonal entries of the covariance matrix Sg v , we have shown 
in Sect. 14. 2.T1 that the state-space model is the same for the three 
modes. The result of this approximation is not only that we can 
apply the spectral fit to each of the three non-trivial modes, but 
also that we can average the spectra of the three modes and per- 
form one global fit. Indeed, assuming three independent modes, 
the three periodograms are independent realizations of the same 
state-space model. Averaging periodograms for identification 
translates into a modification of the periodogram statistics, and 
thus a new likelihood function for the identification method (see 
Appendix IB1 for some details). The important advantage is that 
the periodogram is a more accurate estimation of the power spec- 
trum, and that even small perturbation components become sig- 
nificantly discernible from statistical noise. 

The OPD Kalman-filter control scheme, on the other hand 
(Sect. 14.31 ), is based on the redundant tracking of all OPDs by 
six two-telescope Kalman filters. In this case, it thus suffices 
to apply the spectral fit to each of the six OPD measurement 
sequences. The disadvantage is, of course, that we have to ap- 
ply the identification method six times, instead of once in the 
modal scheme. Additionally, since we have only one sequence 
per OPD, we cannot apply the advantageous averaging of peri- 
odograms. 

5. Fine-tuning the fringe-tracking control 

Our aim to find appropriate Kalman-filter control schemes led 
us to consider two options: tracking three observable modes in 
a three-component Kalman filter and tracking six OPDs in six 
individual Kalman filters. We now consider some specific cases 
encountered in real observations, and present a way of dealing 
with them in our control schemes. 



(42) 5. 1 . Low S/N baselines 



Different processes can lead to lower-quality OPD measure- 
ments on a given baseline. Examples are non-central fringe mea- 
surements in the fringe envelope and variable splitting ratios of 
the beams. Whenever the fringe amplitude on a certain baseline 
is low with respect to the noise level, the control strategy should 
be designed to maintain the fringe tracking at as optimal a level 
as possible. 

An advantage of the interferometer concept considered 
above is that its redundant architecture enables it to take into 
account low fringe-amplitude baselines, in which different OPD 
combinations can compensate for other OPD measurements 
(e.g. OPD 01 = OPD 02 - OPD 12 ). The key step will be to take 
a weighted recombination of the six baselines, in order to mutu- 
ally improve the raw measurements. 

In essence, we can write the process of weighting the resid- 
ual OPD values as 

yw = Iwy, where \ w = M M^, (45) 

with an associated measurement-error covariance matrix 



5-w,W - Iw^n- liy- 



(46) 



{y^Hy^|» = 0,...,JV-l}, 



(44) 



The 6x6 matrix 1^ combines in a weighted way the OPD 
measurements, which allows us to mutually improve each OPD 
value. The weights W-* attributed to each OPD' J are organized 
in a diagonal weighting matrix W, which is used to calculate the 
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previously introduced matrix ML (Eq. d43ll). In accordance with 
the purpose of weighting, a proper definition of W is 



w = z: 



(47) 



which is the inverse of the (diagonal) measurement-error co vari- 
ance matrix. In other words, we take W'-* = (cr[{.y 2 . We note 
that either global weights W, i.e. fixed weights during operation, 
or instantaneous weights W„, i.e. defined for each OPD mea- 
surement, can be considered. The latter is possible owing to an 
instantaneous measurement error that is provided by the phase 
sensor (we refer to Eq. ((59) in Sect. |7). 

In the modal Kalman-filter control scheme, we now simply 
apply the standard OPD-to-mode transformation in Eq. (l32t to 
the weighted modes yw, and accordingly use a weighted mea- 
surement covariance matrix 



-Q,w,W 



H I H ,jy H . 



(48) 



For the OPD scheme, the six Kalman filters are applied to the six 
weighted OPDs. We note that in this scheme, the weighting is as- 
sumed not to introduce coupling into the OPDs. In other words, 
we ignore off-diagonal terms in the covariance matrix Z,,,^ in 
the OPD scheme. This allows us to keep using the 6 indepen- 
dent Kalman filters, and will prove useful for the adaptive-gain 
definition below. 

A final note on this OPD weighting is that it is only use- 
ful for non-resolved fringe-tracking targets. When the target is 
(significantly) resolved, the intrinsic visibility phases may be 
non-zero and hence phase closures cannot be assumed to be 
(e.g. OPD 01 + OPD 12 - OPD 02 * 0, in general). In any case, 
fringe tracking of significantly resolved objects is disadvanta- 
geous owing to the lower fringe contrasts. 



5.2. Flux dropouts 

OPD weighting resolves the issue of low S/N's on individual 
baselines. A second problem occurs when the flux acquired at 
a telescope is low, i.e. the so-called flux dropouts. The notion 
of flux dropouts contains the different processes by which the 
number of photons coming from a certain telescope drops. These 
include: 



a Kalman filter, is that it allows us to continue operating based 
on the state-space model. 

In AppendixICl we argue for the necessity to modify the gain 
to take into account flux problems at the level of a telescope. 
In essence, we modify the constant Kalman gain and allow it 
to decrease when observing conditions on associated telescopes 
degrade. The result is that the estimation of the next state is based 
more on the model, i.e. the observed (noisy) residual OPD is not 
taken into account as effectively. 

To introduce the instantaneous-gain definitions, we first de- 
fine the instantaneous versions %.Q, Wt w,n an d Z»,vk« °f tne previ- 
ously introduced noise-covariance matrices (Eqs. (@6) and d48l). 
The instantaneous covariances are recalculated at each time step 
using instantaneous weighting matrices W„ (rather than a con- 
stant W). The discussion in Appendix ICl then leads us to the 
instantaneous-gain definitions 



r _ tr{I 6 ,„,,iy} 
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("tr" denotes taking the trace) and 
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[OPD scheme] 



(49) 



(50) 



(the index ij to the parentheses indicates that we take the matrix 
diagonal element corresponding to the considered OPD u ; for in- 
stance, for ij = 01 and ij = 12, we take the zeroth and third 
diagonal element, respectively). 

The advantage of the OPD scheme is now clear. Here we 
consider the extreme case when we lose all information provided 
by one telescope. In the modal scheme, the loss of information 
immediately means that we lose all mode observables, which 
implies that we cannot take into account the observational infor- 
mation at that step (the instantaneous gain matrix becomes zero). 
On the other hand, losing a telescope in the OPD scheme means 
we still have three OPDs that are properly measured, and the 
gain for those Kalman filters can stay fixed. This implies that the 
command is still based on half the number of usual observables 
(versus none for the modal scheme). The cost we have to pay is 
that we need to track six variables instead of three. 



- Atmosphere-related effects, where scintillation and other tur- 
bulence effects can lead to a failure of the AO system. The 
resulting Strehl ratio can be insufficient for proper fringe for- 
mation and detection. In addition, cirrus may lower the flux 
arriving at the focal plane of a telescope. 

- Errors in the fiber injection, where mechanical problems 
(e.g. vibrations) and other propagating tip-tilt errors in the 
optical train can lead to problems in the injection of the 
beams into the fibers, coupled to the beam combiner. 

As far as the second point is concerned, GRAVITY will have 
a fib er coupler subsys tem to perform an internal tip-tilt correc- 
tion dPfuhl et alj|2010l) . Owing to the residual errors, however, a 
specially developed fringe-tracking strategy is still required. 

It should be clear that the problem of flux dropouts is very 
different from the problem introduced in the previous section. 
When the flux drops, it is impossible to compensate for the low 
signals on the baselines associated with that telescope: all these 
baselines are affected in a similar way. For non-predictive con- 
trol schemes, no other solution exists apart from waiting until the 
lost signal re-emerges, while keeping the command for that tele- 
scope fixed. The advantage of using a predictive method, such as 



5.3. Piston isolation 

For some specific reasons, it might be interesting to completely 
decouple one or more telescopes from the command estimate. 
This can occur when a telescope (or its AO system) temporally 
becomes very unreliable during an observing run. The flexibility 
of the OPD scheme allows us to track blindly on one/two tele- 
scope^) without affecting the tracking of the others, after which 
we can switch back to the usual scheme. 

To temporally decouple telescope i, we attribute a weight 
zero to all associated baselines to telescope i. Doing so, how- 
ever, results in putting the command on telescope i to 0. This 
can be disadvantageous to the stability of the system, as it gen- 
erally involves a large jump in all actuator positions (the average 
actuator position is always 0, hence all actuator positions change 
when one command is put to 0). The basic idea is then to add an 
extra component to the command vector, 



u ( , - ML KxopD,fl+ii« + L' Kxopd,«+ii«- 



(51) 



where the components of xopd,b+1|h that correspond to i are 
blindly estimated (gain 0). The newly-introduced matrix L', on 
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the other hand, has a specific structure that adds a blindly esti- 
mated command to i, and subtracts an equal value from all other 
commands such that the sum of the commands is zero. For in- 
stance, for i - (i.e. to isolate P°) and i - {1,2} (i.e. to isolate 
both P l and P 2 ), we have 



L u = ^ 



( 3 

-10 

-10 

-1 0J 



and 



L»-i 



(0 -1 -1 0\ 

2 

2 

-1 -1 0) 



, (52) 



respectively. As we subtract an equal command from each non- 
decoupled actuator, these commands on the decoupled tele- 
scope^) do not affect the non-decoupled telescopes. 



6. Full Kalman-filter operation 

We have introduced the basic theory, developed several control 
schemes, and considered strategies that allow us to minimize the 
problems associated with imperfections of the system/control 
schemes. It is a good point now to define the full Kalman-filter 
operation by comparing the two schemes. The operation is split 
into an identification phase and a (real-time) tracking phase, the 
former being needed to find the parameters. 



Identification phase. 



1 . Pseudo-open loop (POL) data acquisition: a set of data points 
and estimated measurement errors is obtained in closed loop 
and corrected for the applied commands to get a POL se- 
quence. 

2. Weighting: we estimate a global measurement noise for each 
POL OPD sequence by taking the median of the measure- 
ment errors. We apply the weighting in Eq. (l45t to get the 
weighted OPD sequence. In addition, for the modes, apply 
the OPD-to-mode transformation in Eq. (l32t to calculate the 
weighted mode sequence. We also calculate the weighted co- 
variance matrices using Eq. (|46*1 i or (|48V 

3. Spectral fit: the spectral-fit method is applied to either the av- 
eraged mode periodograms (modal scheme) or the individual 
weighted OPD periodograms (OPD scheme) to get the fit pa- 
rameters. 

4. Filling of the Kalman matrices: the fitted parameters are used 
to define the system matrices, and to calculate the gain G ra . 

Tracking phase. 

For each elementary fringe exposure: 

1. Weighting: the residual OPD estimates for the six baselines, 
obtained from the phase sensor, are combined in a weighted 
way to give optimal residual mode estimates (modal scheme) 
or OPD estimates (OPD scheme). The weighting factors are 
calculated from instantaneous estimates of the measurement 
errors. We note that this step requires a singular-value de- 
composition to calculate the weighted generalized inverse 
M w of M. We calculate the associated instantaneous mea- 
surement covariances Z.Q, w ,w,n or ^w,w,n (Eq. (l46b or (l48l i with 
instantaneous weights). 

2. Kalman step: we apply the Kalman filter, using the instanta- 
neous gain definitions (Eq. (l49l or d50ll). 

3. Command calculation: we calculate the actuator commands. 
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Fig. 4. Power spectra used to simulate the perturbations on the 
four pistons (four colors). The turbulence-only spectrum is indi- 
cated by the dashed line. 

7. Simulations for VLTI/GRAVITY 

To test the performance of our Kalman-filter control, we con- 
struct a simple simulator of the control process in a fringe 
tracker, which also allows us to distinguish between the dif- 
ferent control schemes. The simulations are performed in func- 
tion of the key science case of the four-telescope beam com- 
biner GRAVITY: the astrometric o bservation of the Gala ctic - 
center neighborhood. As described in lGillessen et al.l(l2010l) . sta- 
ble fringe tracking of a magnitude K — 10 star is required, with 
a specification of 300 nm for the residual OPD. 

More complete simulations of the fringe tracking process 
for GRAVITY, includin g frame-by-frame ac quisition, are post- 
poned to a future paper (IChoquet et al.ll2012T, in preparation) . In 
the same paper, a comparative study with classical control algo- 
rithms will be presented and loop parameters will be optimized. 

7.1. Disturbance simulation 

Data is simulated based on model power spectra for the asso- 
ciated phenomena. Since turbulence and vibration components 
are assumed to be mutually independent, we can sum individual 
perturbation sequences to assemble complete data sequences. 

For each data sequence, we take a number of vibration com- 
ponents under the form of randomly excited damped oscilla- 
tors. The frequencies chosen are inspired by OPD power spec- 
tra obtained with the VLTI/PRIMA fringe tracker, shown in 
ISahlmann et al.l(l2009l) . The total rms deviation per piston is cho- 
sen between 200 and 240 nm, w hich is a realistic es timate of the 
current vibration level at VLTI dPoupar et al.ll2.Q10l) . The vibra- 
tion profiles used for each telescope are shown in Fig. 2](super- 
imposed on the turbulence profile), with some specifications in 
TableQ] 

Considering the turbulence contribution. IConan et al.l(ll995f) 
derive the asymptotic laws for single turbulence-layer OPD 
spectra 



S(f) 




0.2 v/B > f, 
0.2v/B<f <03v/D, 
0.3 v/D < f 



(53) 



(pure Kolmogorov turbulence, i.e. with infinite outer scale), 
where v is the wind speed parallel to the baseline, B is the 
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baseline, and D is the diameter of the telescopes. The first two 
power laws in Eq. d53T > are clearly observable in OPD spec- 
tra ob tained with the Mark III interferometer dColavita et al.l 
Il987l) . In this case, the steep /~ 17 ^ 3 cut-off is not observed 
because of to the small telescope diameters (high cut-off fre- 
quency). Nevertheless, the /~ 17/3 law, which is also expected 
for mono-layer tip-tilt spectra, is also in practice not observed 
on large telescopes. T he tip-tilt power spect ra of the AO-system 
VLT/NAOS, used by iMeimon et al] d2010h to verify the spec- 
tral identification method (MPFK method, Sect. 14.31 ). are well- 
modeled by an AR(2)-model spectrum (lacking a very steep cut- 
off). In addition, OPD spectra obtained with the VLTI/PRIMA 
fringe tracker using the ATs (D = 1.8 m) are compatible with 
an / ~ 8/3 power law up to 2 00 Hz, where a flat noise tail takes 
over (Sahlman n et al.ll2009l) . We thus conclude that, in real-life 
situations (multi-layer turbulence, integrated C^-profiles, multi- 
directional wind), the f~ 11 ^ law is not observable/observed. 

We therefore decide to use the OPD-spectrum model in 
Eq. (l53ll. but d i scard the f~ 11 ^ cut-off. A very similar model is 
used bv lGittorJ d2010l) in the context of VLTI instrumentation. In 
TableQ] we present the values assigned to the parameters, which 
are representative of the observing conditions at the VLTI. Each 
piston sequence is scaled to an rms deviation of 10 |im, such 
that the resulting OPD sequence has an rms value of ~ 14 (j.m 
(10 V2 yum). The theoretical turbulence profile is shown in Fig. 2] 



7.2. Flux dropouts and measurement noise 

In addition, we need to introduce the phenomenon of varying 
flux in the simulations, which manifests itself as variable mea- 
surement noise. This means that, in parallel to the simulated 
OPD sequences, we need to simulate a time sequence of instan- 
taneous noise characteristics, and add a random noise-realization 
sequence to the noise-free OPD sequence. The measurement er- 
ror cr[i on a baseline ij is inversely proportional to the mean 
observed visibility V'-' and the S/N p'' on that baseline, where 



V< 



and 



N< + N' 



p' J = 



N' +N J 



\[n* +NJ + 4 • RON 2 



(54) 



(55) 



Here, RON is the read-out noise per pixel and N' is the num- 
ber of coherent photons arriving from telescope i for a single 
phase measurement. The factor 4, on the other hand, is due to the 
phase sensing for GRAVIT Y being based on 4-p ixel measure- 
ments (ABCD algorithm, see lShao & Staelinl 19771) . For simplic- 
ity, we only consider OPD estimation using phase-delay estima- 
tion. Phase-unwrapping techniques for resolving the 2k ambi- 
guity in measured fringe phases, for example using a combina- 
tion o f phase-delay and group-delay algorithms (e.g. lBlind et al.l 
1201 lh . are postponed to future work 

In terms of the number of coherent photons N arriving per 
exposure on one telescope, we have 



; 1 1 ~- 

N' = -• -TN, 

3 5 



(56) 



where T is the total coherent throughput (AO-corrected atmo- 
spheric turbulence, delay-line system and GRAVITY). The fac- 
tor | comes from the splitting of arriving photons on the three 
baselines corresponding to the telescope, while i refers to the 




Fig. 5. Ten-second extract of a normalized throughput sequence 
for one of the pistons. 



dispersion of fringes on five spectroscopic channels (needed for 
the group-delay estimation, not considered in the current simu- 
lations). 

The variability in the coherent throughput T is simulated 
based on a model for the propagating tip-tilt errors in the AO 
system, thus simulating improper fiber injection. Basically, we 
assume that 



r = 7oexp(-0 2 ). 



(57) 



The quantity <p is the relative offset of proper fringe injection 
(the ratio of tip-tilt offset to the fiber mode field radius). On the 
basis of tip-tilt measurements at the VLTI, the temporal power 
spectrum for (p is derived to be 



flog/ for 2Hz</<8Hz, 
S (/)<*i -log/ for 8Hz</<50Hz, f± 18.1 Hz, 
[6(f) for /= 18.1 Hz 



(58) 



(P. Gitton, private communication). The total rms tip-tilt devi- 
ation is scaled to 14.6 mas, of which 5 mas is present in the 
vibration component. A deeper discussion of this power spec- 
trum would lead us too far for current purpo ses, and will be pre- 
sented in future work (IChoquet et al.l 120121 in preparation). An 
extract of a simulated normalized throughput sequence is shown 
in Fig. |5] 

Since phase delays estimated on the five spectroscopic chan- 
nels can be combined, the estimator for the measurement error 
is derived to bqj 



,1 



2 \lN> + NJ + 4 • RON 2 
5 2 y/N'NJ 



(59) 



dHouairi et al.l l2008). The parameters in this and the above equa- 
tions are specified in Table Q] and are representative of the astro- 
metric observation of the Galactic center, which is the principal 
science case of GRAVITY. At maximal throughput (T = Tq), 
we derive a coherent photon number per exposure of N' = 20 
and a corresponding measurement error of crl - 68 nm. 



7 Note that, during operation, the number of photons per aperture 
can be directly estimated from the six ABCD measurements (ABCD 



photometry, see lBenistv et al.ll2009h . 
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Table 1. Simulation parameters, representative for the Galactic - 
center observation of GRAVITY. 



Observation & hardware 




star magnitude K 


10 


central wavelength A 
telescope diameter D 
baseline B 


2.22 \im 
8.2 m 
80 m 


sampling frequency / 
tracking run 
# spectral channels 
fringe-sensing algorithm 


300 Hz 
100 s 

5 
phase delay 


Disturbances & noise 




rms piston turbulence 

wind speed v 

# vibrations per baseline 


10 urn 

15ms- 1 

8-10 


rms piston vibration 


200-240 nm 


max. coh. throughput 7~o 
rms tip-tilt deviation 


0.007 
14.6 mas 


RON 


6e- 



7.3. Turbulence-only simulations 

Before considering simulations with the complete set of ex- 
pected OPD perturbations (turbulence and longitudinal vibra- 
tions), we perform a number of fringe-tracking runs including 
only the effect of turbulence (but with flux dropouts). The simu- 
lation process is started by generating a sequence of 2000 POL- 
data points. To keep the focus of our simulations on investigat- 
ing the principle and performance of Kalman-filter fringe track- 
ing, we chose to construct POL data in a simple way by adding 
measurement errors (Eq. (l59l ) to a turbulence OPD sequence. 
In complete and fully realistic simulations of the fringe track- 
ing, the simulated POL data should be acquired in a frame- 
by-frame simulation with the phase-sensing algorithm. As men- 
tioned before, this pr oblem will be addressed in our future work 
(IChoquet et al.ll2012l in preparation). 

Following the operation scheme defined in Sect. [6] we sim- 
ulate 200 full fringe-tracking runs of 100 s for both the modal 
and the OPD Kalman filters. To test our definition of the instan- 
taneous Kalman gains when handling flux dropouts, we distin- 
guish between simulations without modifying the gains (F, fixed) 
and with instantaneous gains (I). 

Once the fit parameters are obtained and the Kalman matri- 
ces are generated, both recombination schemes can be tested. 
The fringe tracking is initiated with an empty state vector 
(xo|-i = 0). The convergence of the Kalman filter to the actual 
state of the system typically takes a few to a few tens of track- 
ing points. We reject this transition region when calculating the 
residual OPD. 

The results of the turbulence-only simulations are shown in 
Fig. [6] In this plot, all residual rms OPD values (6 per run) are 
grouped in blocks of 2 nm, giving us a measure of the distri- 
bution of residual OPD values. We found that the performance 
of both control schemes (modal and OPD) in both gain settings 
(fixed and instantaneous) is similar. The residuals are distributed 
with some degree of asymmetry around a nominal residual value 
in the range of 125-145 nm. The total tracking residuals of 100-s 
simulations are thus only about twice as large as the measure- 
ment error cri at maximum throughput for a single OPD ob- 
servation (68 nm, see previous section). This indicates that the 
tracking performance of the Kalman-filter schemes is good. For 
both filters (modal versus OPD), the gain modification slightly 
reduces the residual OPDs, by about 3-4 nm (about 5-6 % in en- 
ergy). 
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Fig. 6. Distribution of residual OPD values resulting from 200 
simulations with turbulence as the only OPD disturbance. The 
plots show the number of residuals per block of 2nm. Both 
Kalman-filter schemes are considered, with instantaneous (I) and 
fixed (F) gains. 



7.4. Full perturbation simulations 

We now turn to simulations based on the full power spectrum in 
Fig. [4] We put a maximum of 40 on the number of vibrations 
that can be identified during the mode identification, and 20 per 
baseline for the OPD identification. These numbers correspond 
to a maximum matrix size of 240 x 240 for Zoo. 

As in the case of the turbulence-only simulations, we used 
2000-point data sequences for the identification. The result of 
a typical spectral fit is shown in Fig. [7] By definition, the aver- 
age mode spectrum contains all vibration components. This is 
clearly apparent in the upper plot, where the number of compo- 
nents to identify is about twice as high (a single baseline has the 
vibrations of only two telescopes). Another thing to note is that 
the statistical spread of the average modal periodogram is indeed 
lower than that of the OPD periodogram. 

A total of 200 full simulations of identification and 100-s 
fringe-tracking run were performed. The corresponding distri- 
butions of OPD residuals are shown in Fig. [8] organized into 
blocks of 10 nm. The addition of vibration perturbations has in- 
troduced a considerable spread in the residual OPD values. In 
addition, unlike in the turbulence-only simulations, a clear dif- 
ference in performance between the modal and the OPD scheme 
is found. The residual rms value for the modal scheme is about 
290 nm, on average, where about 30 % of the measurements are 
above 300 nm. The OPD scheme, on the other hand, has an av- 
erage of about 240 nm for the rms OPD value, and only 6 % of 
the residuals surpass 300 nm. When choosing between fixed or 
instantaneous gains, there is no apparent significant difference. 
We describe these observations in greater detail in Sect. 17.51 

Since the vibration perturbations considered are on the level 
of 200-240 nm, the corresponding OPD vibration perturbations 
are roughly 300 nm (e.g. 220 "V2 nm = 311 nm). Comparing the 
smallest residuals of turbulence-only simulations (~ 125 nm) to 
the best residuals of the current simulations (~ 200 nm), our 
simulations indicate that as much as 75 % of the vibration en- 
ergy can be filtered out by our Kalman filters (we calculate that 
(200 2 - 125 2 )/3 1 1 2 = 25 % of the vibration energy is left). 
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Fig. 7. Result of applying the spectral-fit method to an average 
mode periodogram (top) and one of the six OPD periodograms 
(bottom). The statistical spread in the periodogram points is 
clearly smaller than in the average periodogram, in other words 
the power spectrum is more accurately estimated. 



7.5. Discussion 

The above simulations give a first quantitative result for the 
Kalman-filter schemes developed in this work. We have illus- 
trated the operation of the spectral fit on realistic perturbation 
profiles, and have found the performance one would expect from 
the modal and OPD control schemes. 

When considering our definition of instantaneous gains, it 
is clear that no significant differences between fixed and instan- 
taneous gains are seen, for the considered flux perturbation se- 
quences. One conclusion that might be drawn is that the instan- 
taneous gains do not improve the fringe-tracker performance. 
To verify that this would be the case, we performed simula- 
tions without the flux dropouts, for which we added the re- 
sults to Fig. [8] Surprisingly, although there is some apparent 
difference, the global pattern in the distribution of OPD resid- 
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Fig. 8. Distribution of residual OPD values resulting from 200 
simulations with full perturbation contribution. The plots show 
the number of residuals per block of 10 nm. Both Kalman-filter 
schemes are considered, with instantaneous (I) and fixed (F) 
gains. In black dashed and dotted lines, we overplot the corre- 
sponding distributions for simulations without flux dropouts. 



uals seems to be rather unaffected by the flux dropouts. At this 
point, it might be expected that the Kalman-filter based control 
scheme is not limited by the expected fluctuations in flux. Future 
comparison to other control strategies would be required to de- 
termine whether this property is exclusive to Kalman filtering. 
Although the instantaneous-gain definitions do not change the 
Kalman-filter performance for the considered flux perturbations, 
we assume that the gains are still useful. When poorer observing 
conditions prevail, causing longer flux dropouts for instance, the 
gain can still be expected to limit the risk of improper tracking. 
In this respect, using instantaneous gains might be very advan- 
tageous to the fringe tracking on the ATs, which lack a higher- 
order AO correcting system. Additionally, when lower fluxes are 
assumed for the observed object, the gain modification becomes 
useful. 

The main observation of our simulations is that the OPD 
Kalman filter provides the best results. This is somewhat un- 
expected, since the modal scheme has the most reliably esti- 
mated power spectrum to model, and has the optimal number of 
variables to track. One of the main disadvantages of the modal 
scheme, however, is that all vibrations are fitted at once. As a 
result, the periodogram fit is more difficult: 

- A higher fraction of the periodogram points are part of a 
vibration peak, thus fewer points are available to estimate 
the turbulence component. Since the latter component is the 
most important perturbation, a good estimation of its associ- 
ated power spectrum is essential. 

- Vibration peaks place a much stronger bias on the estimation 
of the turbulence component. The reason is that the statistics 
of an average mode periodogram are "narrower" (i.e. have 
a smaller spread, see the discussion in Appendix |B), hence 
vibration peaks have a stronger influence on the turbulence 
estimation than in the case of an OPD-periodogram fit. 

- The more peaks that are present in the periodogram, the 
higher the probability that the peaks will overlap. Owing to 
the insufficient resolution of the periodograms, overlapping 
peaks will be fitted as a single broader peak, which is more 
difficult to correct (more transient behavior). 
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All of these properties might lower the quality of the power- 
spectrum estimation for the modal scheme, hence lower the 
Kalman filter performance. 

Another property that might have been incorporated to ex- 
plain the performance differences between modal and OPD con- 
trol is that flux dropouts have different implications for the 
schemes. We recall that a low level of flux at one telescope can 
significantly affect the quality of all modes, whereas three of the 
six OPDs are well-measured in the OPD scheme. However, since 
simulations with flux dropouts give similar results to simulations 
without, the performance difference cannot be attributed to flux 
dropouts. Temporally losing information about all mode observ- 
ables thus does not significantly influence the tracking perfor- 
mance of the Kalman filter. In contrast, for non-predictive con- 
trol algorithms applied to the modes, losing observables can be 
expected to be a critical limitation. 

When considering the actual shape of the distributions in 
Fig- El we find a quite large spread in the residual OPD values. 
Not taking into account local statistical jitter, the distributions 
have a clear global maximum with some "leaking" to higher 
residuals (~ 10-15 %). We ascribe these outliers to improper es- 
timation of the power spectra of the perturbation components. 
Improvements to the identification method could indeed be made 
to make it more robust (e.g. optimal grid construction, reconsid- 
ering the number of identification points). 

As a final note, it cannot be ruled out that the performance 
difference is related to the conceptual difference between the 
Kalman filters. For a vibration acting on one telescope, the three 
baselines that are not associated remain unaffected from this vi- 
bration. In the modal scheme, however, the vibration needs to be 
filtered in all modes. It might be this property of indirectly af- 
fecting the other telescopes (via the modal filtering) that causes 
the lower average performance of the modal scheme. More tests 
are needed to verify this reasoning. 

7.6. Performance with respect to GRAVITY 

The main conclusion of these simulations is that the average 
performance of the Kalman-filter control schemes is compati- 
ble with the p erformance requ ired for GRAVITY, specified at 
300 nm dGillessen et al.ll2010t) . For the OPD scheme, at least 
90 % of the performed tracks should attain this stability level, 
and a nominal performance of A/ 10 is within reach. This makes 
the latter scheme clearly the most suitable of the two. 

The most important working point to reach this stability level 
for our Kalman-filter schemes is optimizing the operation of the 
identification method. In this respect, a decent implementation 
should limit both the risk of low-quality identification and the 
calculation time. For the latter point, an optimal adaptation of 
the parameter grid to th e typical prevailing d isturbances at the 
VLTI should be defined dChoquet et alj|2012l in preparation). 

Compared to the simulations done in IChoquet et alj (1201 Ol) . 
based on a classical integrator controller, our simulations show 
that the nominal Kalman-filter performance might be compati- 
ble or better. In any case, we should immediately nuance this 
comparison, since the assumed parameters and perturbation pro- 
files are rather different than the ones used here. For example, 
the simulated turbulence profiles used here exclude the unob- 
served /~ 17 ^ 3 part and a higher and more realistic vibration con- 
tribution is assumed. Our future work will present a firmer base 
for comparing Kalman-filter fringe tracking to classical control, 
something that is beyond the reach of the current work. 



8. Summary 

In the framework of new-generation interferometric instruments, 
we have designed and analyzed a Kalman control scheme for 
fringe tracking. The starting point of this work was the work 
done in AO control, in particular a control scheme proposed by 
iLe Roux et alJ d2004l) based on Kalman filtering, which is an op- 
timal data processing algorithm. Apart from the usual correction 
for turbulence disturbance, a Kalman filter allows us to correct 
for vibration disturbances, present in VLTI observations under 
the form of longitudinal vibrations. 

We started by formulating a control scheme for two- 
telescope fringe tracking, based on a parametric model for the 
fringe-tracking disturbances. To estimate the involved parame- 
ters, we included an identification metho d based on the charac - 
terization of perturbation power spectra dMeimon et al.ll2010T) . 
As a direct application to multi-telescope fringe tracking, we ex- 
tended the fringe-tracking formalism to four-telescope control 
schemes. We note that our approach can easily be generalized to 
n-telescope fringe tracking. 

Two Kalman-filter control schemes are considered in the 
four-telescope case: a modal-based scheme, where three modes 
(linear combinations of pistons) are tracked in a single three- 
component Kalman filter, and an OPD-based scheme, where six 
OPDs are tracked using six Kalman filters. The advantage of the 
former is that only one spectral fit needs to be performed: the 
three modal periodograms can be averaged. To ensure that the 
control schemes are unaffected by low S/N baselines and low 
fluxes at the level of the telescopes (flux dropouts), we included 
a weighting step and a modification to the gain. 

Simulations performed in the context of the four-telescope 
beam combiner VLTI/GRAVITY allowed to test the two control 
schemes and the concept of instantaneous gains. Two important 
observations that follow are: 

1 . Despite being operationally heavier, the OPD control scheme 
provides the best results. We (partly) attribute the lower per- 
formance of the modal control to the more difficult identifi- 
cation (all vibration parameters need to be extracted at once). 
There might also be a conceptual performance limitation re- 
lated to Kalman filtering of the modes, in that all perturba- 
tions need to be corrected in all modes. 

2. For the considered flux perturbations, the instantaneous 
gains do not significantly improve the fringe-tracking per- 
formance. 

When considering the latter point, comparison to simulations 
without the flux dropouts have indicated that Kalman-filter con- 
trol seems to be rather unaffected by the considered flux per- 
turbations. Yet, the instantaneous gain might prove useful when 
worse observing conditions prevail (e.g. fainter tracking source, 
longer flux dropouts). 

As far as the science case of VLTI/GRAVITY is concerned, 
both schemes show nominal performances that meet the spec- 
ified performances. The results open perspectives for attaining 
even higher quality residuals (A/ 10 performance). The most im- 
portant working point for the future is optimizing the identifica- 
tion method. 
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Appendix A: Derivation of the modal state-space 
model 

The derivation below is based on the state-space model in 
Sect. |2] We use a natural intermediate state-space model to de- 
velop the modal model. 



A.1. Canonical state-space model 

Before going to the complete set of four telescopes, we recast the 
problem of a two-telescope interferometer (Sect. |2} in a more 
useful form. The result allows a direct generalization to an n- 
telescope interferometer. 



A.1.1. Two-telescope interferometer in a piston scheme 

In two dimensions, piston disturbances and associated OPD 
measurements can be written like P = (P° P l ) T and (trivially) 
OPD = (OPD 01 ). Finally, the matrix M that governs the conver- 
sion P -^ OPD is 



M = (-l 1). 



(A.1) 



We define a new state-space model in terms of fundamental 
piston disturbances P (rather than the phase disturbances <p in 
Sect. 13. For instance, a vibration on the level of piston P k 
(k = 0, 1) can be written as 



}/:,vib _ k.vib pk,vib , „£,vib j^k.vib 



+ a~ 



P'-" + V 
n—\ 



i.vib 



where we note that we simply include one extra index, k, to re- 
fer to the considered piston. In this state-space model, new state 
vectors x are defined as 



X « — V X « X «J ' 



(A.3) 



where, taking e.g. two vibrations on piston and one on piston 

1, 



o 



3 0,tur 
jl.tur 



/pU,tur pU.tur pU.vib 1 pU.viD l pU.vibZ pU,viDZ\ 
V n n-\ 1 n-\ n n— 1 / 



pl.tur pl.vibl p 



0,vib 1 

1 
l.vibKT 



,0,vib2xT 
1 



X 1 = (P U 

The full equation of state can then be written in block form as 



+ i 



x„-i 



A 
A 1 



+ 



(A.4) 



where block diagonal matrices A' are defined exactly as before 
(for each fundamental piston). We assume that the system noise 
v is composed of independent white Gaussian noise components, 
hence the covariance matrix Z v is still diagonal. 

Given that we consider the same two-telescope interferome- 
ter as before, the observable is still the same: the residual OPD 
y 01 (note that we give the index "01" to refer to the baseline 
on which we work). The new state vector requires an updated 
definition of the system matrix C, which we denote C. An ap- 
propriate definition is 



C = (-C° C 1 ) = (-1 1) 



C° 
C 1 



= M 



C° 
C 1 



(A.5) 



where C° and C 1 are row matrices defined as before. For exam- 
ple, the corresponding C° and C 1 to Eq. (IA.3I > are 



(A.6) 



C° = (0 1 1 1) and C 1 = (0 1 1). 



In accordance with the definition of the matrix A in Eq. ( IA.4t . 
we have defined C in Eq. (IA.5I > as the block-diagonal matrix of 
matrices C. Considering the piston control, we wish to calculate 
the commands to be applied directly to the pistons. Command 
vectors are hence of length two (we consider two actuators). The 
full measurement equation in its new form is then 



01 



(A.7) 



The equation of state in Eq. ( IA.4t in now trivially generalized to 
the four-double equivalent 



3^=MCx n -Mu„- 2 + <. 



A.1.2. Four-telescope interferometer in piston scheme 
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o o) 
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X 1 
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V 1 
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A 3 j 




V 3 




wV 



(A.8) 



We still assume that the covariance matrix I,, is diagonal, i.e. the 
vectors vj; (k = 0, 1, 2, 3) have independent white Gaussian noise 
components. The fact that fringes are tracked on six baselines 
leads to the use of residual OPD vectors y and measurement 
noise vectors w, where 



(A.2) y = (y 01 y 02 y 03 y 12 y 1 V 3 ) T , 



(A.9) 
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and analogously for w. The system matrix C and the vectors u„ 
also have their four-dimensional equivalent, and the full mea- 
surement equation finally reads 



y„ = MCx„ - Mu„_2+w„. 



(A. 10) 



As we did with E v , the measurement error covariance matrix Z H 
is assumed to be diagonal. 

A.1.3. Problems with the state-space model 



The combination of Eq. dA.8t and Eq. dA.lOt forms a very natu- 
ral full state-space model of a four-telescope six-baseline fringe 
tracking system. On the one hand, it describes in a natural way 
the disturbances on the level of the four apertures, under the rea- 
sonable assumption that these are decoupled. On the other hand, 
the measurement equation describes how the six measured resid- 
ual OPDs are the result of the contributions of the four funda- 
mental pistons, and of the correction that is applied by the four 
actuator systems (piezos + delay lines). Owing to the very nat- 
ural description given by this model, we can refer to it as the 
canonical state-space model. It seems that we have found an ap- 
propriate input model to build a Kalman filter. 

However, there is a serious difficulty of the canonical model: 
it is impossible to identify its parameters. In Sect. [3] the pa- 
rameter identification method that is introduced is based on ap- 
plying a spectral fit to sequences of (pseudo-)open loop mea- 
surements. Here, however, the matrix M couples all information 
about the evolution of pistons, in a non-trivial way. In essence, 
the problem is that absolute piston information cannot be un- 
ambiguously recovered from observations. Difficulties with the 
system identification for a piston-based state-space model could 
hence have been expected. However, the state-space model de- 
fined above now forms a perfect starting point for the construc- 
tion of a modal-based state-space model. 

A.2. Modal state-space model 

We now consider how the canonical state-space model can be put 
into a modal form, in the spirit of what was done in Sect. I4.ll 
This direct derivation of a modal state-space model is more 
reliable than simply applying one Kalman-filter controller per 
mode. In particular, although the modes are orthogonally de- 
fined, they have coupling terms that are related to there being 
linear combinations of pistons/OPDs. The power of the Kalman 
filter is that it is constructed to deal with vector quantities and 
covariance matrices which describe coupling between different 
terms. 

Modal equation of state. For the new equation of state, we in- 
troduce a piece of new notation. We use the tilde sign (~) to 
denote extended vector and matrix quantities with respect to the 
previously introduced versions. First we introduce the extended 
state-space vectors x„ as 



where e.g. 



(A.ll) 



x^ = (0° 1 x 2 3 ) T , {0 j = Ox/, for ; = 0,1,3). (A.12) 

In words, the components x' n consist of four zero-columns of 
which the k-th zero-column has the same length as the vector xf 7 
(k = 0, 1, 2, 3), but the j-th zero-column is replaced by x' n itself. 



It follows that, by definition, all x' n have the same length (unlike 
the vectors x' n ), and we define 

r = length of vectors 3? n . (A. 13) 

In exactly the same way as in Eq. ( IA. lib , we define the vectors 
~v n as extensions to v„, and the corresponding covariance matrix 
is written as Z-p. Finally, we define the matrix A as 



A = 



(A 0^ 
A 
A 
AJ 



(A. 14) 



i.e. a block diagonal matrix with the square system matrix A on 
the diagonal. A completely equivalent equation of state to the 
one in Eq. ( 1A.8I ) is then given by 



x„+i 



Ax„+v„. 



(A.15) 
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To pass to a modal equation of state, we follow an approach 
similar to that used in the Q-mode definition Q = V T P. We 
define a (still unitary) block-matrix version of the matrix V 



(A. 16) 



where "1 " denotes a unit r x r matrix. The modal state vector is 
then defined as 

x G ,„ = V^x„. (A. 17) 

Multiplying Eq. ( IA.15b by VJ and using the equality A = 
Vp AV P , the modal equation of state finally reads 

x e . n+ i = Ax G ,„ + v Q ,„, (A. 18) 

where we have defined the mode system noise 



VQ,n = Ml 



(A. 19) 



The covariance matrix of Vg is Ig v = VJI-pVp. Using some 
numerical examples, it can be shown that Ig jV has a quadruple 
diagonal structure, i.e. four equal diagonal blocks, but also pos- 
sesses off-diagonal terms. 

Modal observation equation. We multiply the observation 
equation by H (Eq. (IBTb). which leads to 



y&„ = H y„ = R V T Cx„ - R V T u„_ 2 + H w„, 



(A.20) 



where R = diag(l, 1,1,0). We note that we used the property 
H M = R V T . We verify that this equation indeed defines the last 
component of yg,„ to be zero, in accordance with our conven- 
tion that <2 3 = 0. The defined transformation gives the observing 
equation in terms of the residual Q-mode observables yg. Next, 
we need to pass from the piston state vector x„ to the equivalent 
Xg„ for the modes. For this, we define an extended version of 
the matrix C asn 



(C- 0^ 

C_ 

C 





where C = (C° C 1 C 2 C 3 ). 



8 The horizontal bar indicates that C_ is a row vector built of row 
vectors, and allows us to distinguish from the previously introduced C 



matrix (Eq. dA.5t ). In the principal text, we drop this bar, for notational 
simplicity. 
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(A.21) 

It takes some pen and paper work to show that one can rewrite 
the first term in the right hand side of Eq. (1A.20) as 



RV T Cx„ = Cxn,„- 



(A.22) 



Two final definitions are the modal command and the mode mea- 
surement noise 



ug,„ eRV t u„ and w a „ = H w„ 



(A.23) 



(for both, the last component is 0). The full modal equation is 
finally given by 



mixed. As a solution, we can first extract the measurement 
errors of the 6 OPD periodograms. Alternatively, we can 
simply combine the measurement errors estimated by the 
phase sensor (see Eq. d59l l) into a global measurement error 
(one for each baseline). We note that the measurement noise 
of the average modal periodogram is calculated by summing 
the diagonal of Zg iH , and dividing by three (i.e. averaging 
over the three modes). 



Appendix C: Instantaneous-gain definition 

The calculation of an arbitrary Kalman gain matrix Goo (e.g. as 
in Eq. ( fT8l l) is of the form 



y a „ = Cx a „ - u a „- 2 + w a „. (A.24) Gco = Z CO C T (CZ 00 C T + I w ) 

We note that the covariance matrix of Wg is Sg, w = H Z„. H T . 



(CI) 



Appendix B: Details of four- telescope identification 

We denote by P y the periodogram of a POL-data sequence on a 
single baseline, i.e. as in Eq. ( l23"T l. In addition, we define S as the 
power spectrum associated with the periodogram P y . Under the 
assumption of Gaussian white noise excitation, it can be shown 
that, for fixed frequencies /, the distribution of P y (f) asymptoti- 
cally approaches an exponential distribution of expectation value 
S (/). In other words, for long identification sequences, the like- 
lihood function L of P y is close to 



«*■*- n^>m 



(B.l) 



where A denotes all model parameters of S . 

Averaging three exponential distributions leads to a Gamma- 
Erlang distribution of order three. Hence, denoting by P y the 
averaged periodogram, the periodogram likelihood function in 
Eq. ( IB. II ) is modified into 




(/>>'(/))" exp 



3P y (f) 



s(f) 



(B.2) 



At a fixed frequency /, the expectation value of the averaged pe- 
riodogram remains the same, whereas the standard deviation is 
reduced by a factor "V3. The smaller standard deviation is the 
major advantage of periodogram averaging: the power spectrum 
is more accurately estimated, and even low vibration peaks be- 
come significant. The spectral-fit method then requires the fol- 
lowing three modifications: 

1 . The maximum-likelihood criterion needs to be adapted to the 
likelihood function in Eq. ( IB. 21 ). 

2. The threshold for vibration-peak detection (i.e. peak-picking 
criterion) can be lowered, since lower peaks become signifi- 
cantly detectable. More concretely, we choose the detection 
criterion P y {f) > 4 5 (/) to detect vibration peaks (probabil- 
ity of 0.05 % per point for false detection), rather than the 
detection criterion P y (f) > 7 S (/) for non-averaged peri- 
odograms (probability of 0. 1 % per point for false detection). 

3. A more subtle point is the characterization of the 
measurement-noise covariance matrix Ig w (see Eq. (l36ll). 
In the standard MPFK method, applicable to single-baseline 
data, the measurement error is determined from the hor- 
izontal tail of the POL-data periodogram. In modal pe- 
riodograms, the information of the different baselines is 



where Zoo is calculated by solving a Riccati equation. It is clear 
that the gain depends on a measurement covariance I w . Every 
Kalman step should ideally involve the calculation of a new 
gain, to compensate for the variable measurement noise (flux 
dropouts). As this would be a heavy computational burden, we 
propose adding an artificial term to the gain. 

It is clear from Eq. (IC.U that the gain is inversely related to 
the Z„ (neglecting that it is also used to calculate Zoo). It then 
makes sense to define an instantaneous gain Gco, n as a quantity 
of the form 



*3oo,n — /-/t- \ ° 



(C.2) 



where / is a real function. The matrix Z lv „ is defined as the in- 
stantaneous measurement covariance at time step n. For flux of 
the usual level of noise, the definition of Goo, n implies that we 
are tracking using the optimal gain Goo (the multiplicative term 
is 1). Whenever the instantaneous measurement noise increases, 
the multiplicative factor must ensure that the gain is decreased. 
In the extreme case when the measurements are very noisy and 
hence unusable, / can be chosen such that the gain approaches 
0. In that case, the Kalman filter is based completely on the de- 
terministic part of the evolution. 

For the modal and the OPD Kalman filter, we take the 
instantaneous-gain definitions as in Sect. 15.21 It is important 
to understand the difference between the modal and the OPD 
scheme that led to Eqs. ( f49b and d50l l. In the modal scheme, 
each mode estimate is limited by the noise on the worst OPD 
triplet (by triplet, we mean three baselines that are connected 
to the same telescope). This is because these modes are de- 
fined as mixes of OPDs associated with all telescopes. A first 
order multiplicative correction of the gain thus preferably con- 
tains a measure for all error terms, hence we consider the trace 
of the covariance matrix, neglecting the off-diagonal terms. For 
the OPD scheme, however, each Kalman filter tracks a variable 
in an independent way. The gain of the different Kalman filters 
can therefore be adapted purely as a function of the noise on the 
corresponding variable. Again neglecting the coupling terms, the 
gain associated with a baseline ij is modified by the correspond- 
ing term in the error covariance matrix. In this way, we avoid that 
the flux problems for one telescope propagating to the tracking 
on baselines not associated with that telescope. 

We remark that the discussion that led us to Eqs. (|49J and 
(l50l is probably not the end of the story. It is possible that more 
elegant "instantaneous gain" definitions exist. Yet, these defini- 
tions would likely have a similar form to our choice. 
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